Rewrite FixedBufferAllocator and rename to BumpAllocator #25372
Conversation
I may have tried to bite off more than I can chew - sorry. I just noticed that
EDIT: Turns out it was easier to adapt this to work at comptime than I thought. In any case, please let me know what I need to change or do here.
Force-pushed from 8b41d59 to 5ca5710
It would make reviewing easier if you renamed the file in its own commit, without any other changes - then a useful diff is visible instead of [everything removed, new file].
I like the spirit of this PR but I have some concerns.

For starters, the name "savestate" bothers me because it does not conform to camelCase. It should be "saveState" in my opinion. For consistency, it would also make a lot more sense for "restore" to instead be "restoreState."

I agree that the malicious free vulnerability is simply the programmer's fault; however, Andrew and Mlugg have both talked about their desire to add a debug-only reserved-capacity tracking field to ArrayList. I assume that a similar principle would apply to this situation as well. BumpAllocator could have a debug-only field that would allow the implementation to safety-check the freeing behavior, making sure it stays within legal bounds.

I am also questioning the overall design of saving/restoring states in the first place. In the case of ArenaAllocator, save/restore logic would be significantly more complicated, so methods for that are a good idea. But with BumpAllocator, can we not just expect the programmer to save a copy of the implementation struct as a snapshot, then overwrite the active one with the snapshot (see the sketch below)? The struct is so damn simple that I wonder if there's even a justified reason for adding methods to do this. Furthermore, saving a literal snapshot of the struct itself sounds a lot more type-safe to me than a usize. If you really wanted to use an integer to track the snapshot, maybe we could steal Andrew's idea for Named Integers. He has a devlog on it (The Unreasonable Effectiveness of Naming Integers). Could be overkill but it's worth thinking about.
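For illustration, here is a minimal sketch of that snapshot-by-copy idea, assuming a hypothetical two-pointer layout (not the actual struct in this PR):

```zig
const std = @import("std");

// Hypothetical two-pointer layout; the real struct in this PR may differ.
const BumpAllocator = struct {
    bump: [*]u8,
    end: [*]u8,
};

test "snapshot/restore by copying the struct" {
    var buf: [64]u8 = undefined;
    const base: [*]u8 = &buf;
    var ba = BumpAllocator{ .bump = base, .end = base + buf.len };

    // "savestate": the snapshot is simply a copy of the struct.
    const snapshot = ba;

    // Pretend some temporary allocations bumped the pointer forward.
    ba.bump += 16;

    // "restore": overwrite the live allocator with the snapshot.
    ba = snapshot;
    try std.testing.expectEqual(snapshot.bump, ba.bump);
}
```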
The difference between the two is whether it is fixed size or grows dynamically.
One thing I personally think is still overlooked is the alignment correction and verification cost. The following code illustrates what I mean:

// FixedBufferAllocator.zig
const std = @import("std");
const Alignment = std.mem.Alignment;

start: [*]u8,
end: [*]u8,
pub fn init(buffer: []u8) @This() {
return .{
.start = buffer.ptr,
.end = buffer.ptr + buffer.len,
};
}
pub fn alloc(
ctx: *anyopaque,
length: usize,
alignment: Alignment,
_: usize,
) ?[*]u8 {
const self: *@This() = @ptrCast(@alignCast(ctx));
const old_end = @intFromPtr(self.end);
if (old_end < length) return null;
const new_end = alignment.backward(old_end - length);
if (new_end < @intFromPtr(self.start)) return null;
self.end = @ptrFromInt(new_end);
return self.end;
}
// Calling free immediately after alloc does not guarantee the same
// state as before alloc, because it is impossible to tell how much
// additional memory was allocated for the alignment (without saving it).
pub fn free(ctx: *anyopaque, memory: []u8, _: Alignment, _: usize) void {
const self: *@This() = @ptrCast(@alignCast(ctx));
const old_end = self.end;
const mem_ptr = memory.ptr;
if (old_end != mem_ptr) return;
self.end = old_end + memory.len;
}
// You could implement resizing, but you need to copy memory...
pub fn resize(_: *anyopaque, _: []u8, _: Alignment, _: usize, _: usize) bool {
return false;
}

Most notably, we don't need to use

The only part of the program that does not benefit from this solution is (the very, very rare case) when resizing the most recently allocated memory is required. Here you'd always need to copy the memory content. However, in my opinion, implementing smarter free and resize mechanisms is not something a bump allocator should invest in. It should be super fast at allocating and that's it.

Btw, my comment about

If you wanted to implement more capabilities you can, for example, introduce
Initially I had opted for a simpler allocation approach, but the current code in this PR would not work if I used

On a side note, I don't know why the s390x tests are failing.
Don't bother with comptime. The comptime memory is inherently flawed when it comes to dealing with pointers.
Program to showcase:

const std = @import("std");
const BA = @import("BumbAllocatorOld.zig");
pub fn main() !void {
comptime {
var buffer: [@sizeOf(usize) * 5]u8 = undefined;
const buffer_slice: []u8 = &buffer;
var buffer_allocator = BA.init(buffer_slice);
var my_allocator_thingy = buffer_allocator.allocator();
for (0..128) |i| {
const test_ptr = try my_allocator_thingy.alloc(usize, 1);
test_ptr[0] = i;
const yessy = test_ptr[0];
_ = yessy;
my_allocator_thingy.free(test_ptr);
}
}
}
…ter. Fix a bug where a very large allocation could overflow a usize, causing a false positive allocation. Provide limited comptime allocator support.
…feAllocator instead.
Thank you all for the feedback. I cleaned up my code, made sure renaming the FixedBufferAllocator.zig file was its own commit, ran some benchmarks, and tried to document everything a bit more with comments.

It was previously mentioned that "savestate" should be camel-cased. My opinion is that it shouldn't, given that "savestate" is not two words in modern usage: https://en.wiktionary.org/wiki/savestate I understand that a neologism may be undesirable for many people, though I felt the meaning fit pretty well. The core team can decide on what should happen here.

Anyways, here are some benchmarks of the new bump allocator compared to the old.

EDIT: @two-horned You also mentioned that allocating from the back can be more efficient and simpler. I agree - thank you for the comment. I tested this idea and concluded that, while it is slightly more efficient, the benefits of being able to resize allocations (at least, the last allocation) for free are well worth it (see the sketch below). I'd love to see benchmarks proving me wrong though.
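To make that tradeoff concrete, here is a rough sketch, with hypothetical names (not this PR's actual code), of why a forward-growing bump pointer can resize the most recent allocation in place:

```zig
// Hypothetical forward-growing bump state; illustrative only.
const BumpState = struct {
    bump: [*]u8, // next free byte
    end: [*]u8, // one past the end of the backing buffer

    // The most recent allocation ends exactly at `bump`, so growing or
    // shrinking it only means moving `bump`; no bytes are copied.
    fn resizeLast(self: *BumpState, memory: []u8, new_len: usize) bool {
        if (memory.ptr + memory.len != self.bump) return false; // not the last allocation
        const new_end = memory.ptr + new_len;
        if (@intFromPtr(new_end) > @intFromPtr(self.end)) return false; // would leave the buffer
        self.bump = new_end;
        return true;
    }
};
```

With allocation from the back, the newest allocation sits at the low end of the used region instead, so growing it would mean moving its start downward and copying its contents, which matches the earlier comment about resizing.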
Force-pushed from 5ca5710 to 53b991b
I like the recent changes and cleanup. Comments make it very clear what the code does.

This might be an overly specific edge case that may never occur, but what happens if the

The reason I used
// Forward alignment is slightly more expensive than backwards alignment,
// but in exchange we can grow our last allocation without wasting memory.
const aligned = alignment.forward(@intFromPtr(self.bump));
The line I am talking about: might overflow
Thanks for the review - really appreciate it!
Y'know, I hadn't even realized that Alignment.forward could overflow. It would likely mean that either the alignment is extremely large, or the buffer pointer is close to std.math.maxInt(usize). I don't think this case would happen nearly as often as the (already prevented) overflow with the aligned base pointer and length amount though, so if this bug were to occur, it would happen extremely infrequently. Alignment.forward could check this and return a ?usize or something, but it's low-level enough that it is likely my responsibility to fix.

While this could be a simple fix (just inline the code and add overflow protection), I'm not sure that's the best approach. You see, for the buffer address to be close to std.math.maxInt(usize), we would have to be on a really weird system. AFAIK, the maximum userspace address on Linux is 0x0000_7FFF_FFFF_FFFF, so we couldn't trigger this bug on Linux. Supposing we have a machine that addresses (close to, or all of) the entirety of its 64 bits, you'd have to wonder what allocation would require such a large alignment to overflow this - or, in the case of a theoretical machine where memory is addressed backwards (or some other nonsense), why they aren't using their own library (like one that grows downwards with its allocations, as you mentioned before).

I remember that you can map custom addresses with mmap, but I'm pretty sure restrictions apply there too that would stop you from creating a mapping that high up.

Based on this train of thought, I don't think we need to worry too much. I'd prefer not to add the safety check because in practice it would either 1) not occur, or 2) only occur on custom hardware with a custom OS, in which case the programmers would be avoiding the standard library anyway. Oh yeah - and if we did put in this safety check, it would mean that all bump allocators are slower for everyone, for the benefit of the one or two people in the world who need this.
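For concreteness, a minimal sketch of the "inline the code and add overflow protection" option; alignForwardChecked is a hypothetical helper, not part of std or this PR:

```zig
const std = @import("std");

// Hypothetical helper: ordinary align-forward arithmetic with an explicit
// overflow check, returning null instead of wrapping around.
fn alignForwardChecked(addr: usize, alignment: std.mem.Alignment) ?usize {
    const mask = alignment.toByteUnits() - 1;
    // `addr + mask` is the step that can overflow when addr is near maxInt(usize).
    const sum = std.math.add(usize, addr, mask) catch return null;
    return sum & ~mask;
}

test "overflow near the top of the address space is reported" {
    const a = std.mem.Alignment.fromByteUnits(4096);
    try std.testing.expectEqual(@as(?usize, 4096), alignForwardChecked(1, a));
    try std.testing.expectEqual(@as(?usize, null), alignForwardChecked(std.math.maxInt(usize), a));
}
```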
Any counter argument?
No counter argument. As I said, it was just an overly specific edge case. Because the previous version didn't have this behavior, I thought it was worth at least mentioning.

Recent tests failed; I will mark other potential overflow locations.
// Only the last allocation can be freed, and only fully
// if the alignment cost for its allocation was a noop.
if (memory.ptr + memory.len != self.bump) return; |
memory.ptr + memory.len might overflow
If it would overflow, it wouldn't be valid to pass to this function in the first place, and would be safety checked in debug modes. I'm not too worried about this.
var old_bump: [*]u8 = @atomicLoad([*]u8, &self.bump, .seq_cst);
while (true) {
const aligned = alignment.forward(@intFromPtr(old_bump)); |
might overflow (maybe this is the error in the tests, idk)
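For context, the quoted lines are the start of a compare-and-swap retry loop. A minimal sketch of that general pattern, with hypothetical field names (not the PR's exact implementation), looks roughly like this:

```zig
const std = @import("std");

// Hypothetical thread-safe bump state; illustrative only.
const AtomicBump = struct {
    bump: [*]u8,
    end: [*]u8,

    fn rawAlloc(self: *AtomicBump, len: usize, alignment: std.mem.Alignment) ?[*]u8 {
        var old_bump: [*]u8 = @atomicLoad([*]u8, &self.bump, .seq_cst);
        while (true) {
            // Note: alignment.forward itself can wrap near the top of the
            // address space, which is the overflow pointed out in this thread.
            const aligned = alignment.forward(@intFromPtr(old_bump));
            const new_bump_addr = std.math.add(usize, aligned, len) catch return null;
            if (new_bump_addr > @intFromPtr(self.end)) return null; // out of memory

            // Publish the new bump pointer only if no other thread raced us;
            // otherwise retry with the value that thread installed.
            const new_bump: [*]u8 = @ptrFromInt(new_bump_addr);
            if (@cmpxchgWeak([*]u8, &self.bump, old_bump, new_bump, .seq_cst, .seq_cst)) |current| {
                old_bump = current;
            } else {
                const result: [*]u8 = @ptrFromInt(aligned);
                return result;
            }
        }
    }
};
```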
I thought a lot, and decided to give your original idea another try. ArrayList already allocates excess capacity, so I figure that a downward-allocating BumpAllocator is worth a shot. I suspect this PR will lie dormant until the stdlib is investigated in more depth (closer to 1.0), so I feel I have plenty of time.
I'm pretty sure the test failure is related to the comptime logic though - I completely forgot to check for OOM in the alloc function if we are in comptime 😰
It has been mentioned elsewhere that "FixedBufferAllocator" would be better off known as "BumpAllocator." The name change clearly suggests how it works internally, so people do not get confused about why it cannot free certain allocations.
This is an attempt to both rename and rewrite FixedBufferAllocator. The new implementation:
- uses a single usize to track allocator state (instead of three)
- implements the state save/restore proposal for FixedBufferAllocator and ArenaAllocator (#24701) for this new implementation, via savestate and restore

Because this struct no longer stores the absolute buffer base pointer, it is now vulnerable to malicious memory frees decrementing the bump pointer (aka base) below its original value. I do not consider this an issue, because if you have an allocation which is out of bounds, it was never a valid allocation to begin with. This can be checked with std.mem.ValidationAllocator anyhow, and is the programmer's fault.

The reset function can no longer be implemented the same way, because we have less state to work off of. Its use case is satisfied by saving allocator state (via savestate) right after initialization, then using restore(state) at some later point, where reset would have been. This can be reverted, but it would require us to store another usize in the type (bringing it back to the old size).

Additionally, I took the liberty of marking the threadSafeAllocator function as deprecated, as it had limited functionality (it could not free any allocations) and was not used at all in the stdlib and compiler. We already have the wrapper std.heap.ThreadSafeAllocator, which fully satisfies this use case. Of course, this can be reverted if I was wrong to change this.

The only breaking change that should exist in this PR is the removal of reset, which, as said earlier, can be effectively replaced by savestate/restore.

Closes #25371
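As a usage sketch of the savestate/restore replacement for reset described above (pseudocode: it assumes the renamed type ends up as std.heap.BumpAllocator with the method names from this PR, all of which may change):

```zig
const std = @import("std");

pub fn main() !void {
    var buffer: [1024]u8 = undefined;
    // Assumed name and location after the rename proposed in this PR.
    var bump = std.heap.BumpAllocator.init(&buffer);
    const allocator = bump.allocator();

    // Where the old FixedBufferAllocator would have called reset(), save the
    // state once, right after initialization...
    const initial_state = bump.savestate();

    for (0..3) |_| {
        const scratch = try allocator.alloc(u8, 128);
        @memset(scratch, 0);

        // ...and restore that state instead of resetting.
        bump.restore(initial_state);
    }
}
```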